feat: index numerical and date fields in Solr with appropriate types + more targeted search result highlighting #10887
What this PR does / why we need it:
Currently, all fields regardless of type are indexed in Solr as English text (text_en). With this PR, numerical and date fields are indexed in Solr with appropriate types: int as plong, float as pdouble, and date as date_range (solr.DateRangeField).
I chose to index dates as DateRangeField because it can represent dates to any precision, e.g. a day YYYY-MM-DD, a month YYYY-MM or a year YYYY. See: Date Formatting and Date Math :: Apache Solr Reference Guide. This matches the allowed formats in a date field as defined by Dataverse.
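To illustrate the three precisions named above, here is a minimal Python sketch that classifies a date string as a year, month or day. The helper name date_precision is hypothetical and not part of this PR. It only checks the shape of the string, so it does not reject impossible calendar dates such as 2000-02-31, and it is not Dataverse's or Solr's actual validation logic.

```python
import re

# Hypothetical helper for illustration only; not taken from the PR.
# Each pattern covers one of the three precisions mentioned in the description.
_PATTERNS = {
    "year": re.compile(r"\d{4}"),
    "month": re.compile(r"\d{4}-(0[1-9]|1[0-2])"),
    "day": re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])"),
}


def date_precision(value):
    """Return 'year', 'month' or 'day', or None if the string matches none.

    Only the shape is checked; calendar validity (e.g. 2000-02-31) is not.
    """
    for name, pattern in _PATTERNS.items():
        if pattern.fullmatch(value):
            return name
    return None
```

For example, date_precision("2014-12") returns "month", while a string such as "2014-13" matches none of the patterns and yields None.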
This means that range queries are now possible on numerical and date fields, e.g. exampleIntegerField:[25 TO 50] or exampleDateField:[2000-11-01 TO 2014-12-01].
Which issue(s) this PR closes:
This PR implements range queries as discussed in #370 (the issue was already closed).
This PR is related to #8813 and IQSS/dataverse-frontend#278 (the range queries that are now possible lay the groundwork for a nicer search facet UI).
Special notes for your reviewer:
For testing, I've created a sample TSV containing all relevant fields here.
Suggestions on how to test this:
exampleIntegerField:[25 TO 50] or exampleDateField:[2000-11-01 TO 2014-12-01]
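A rough Python sketch of how such range queries could be assembled and turned into a request URL when testing. The function names range_query and search_url are hypothetical, and the base URL passed in is an assumption about the local setup (for instance the Search API of a development instance); nothing here is taken from the PR's code. No request is sent; the sketch only builds strings.

```python
from urllib.parse import urlencode


def range_query(field, lower, upper):
    """Build an inclusive range clause of the form field:[lower TO upper]."""
    return f"{field}:[{lower} TO {upper}]"


def search_url(base_url, query):
    """Append the query as a URL-encoded 'q' parameter.

    base_url is an assumption about the installation under test,
    e.g. "http://localhost:8080/api/search".
    """
    return f"{base_url}?{urlencode({'q': query})}"


# The field names below are the sample fields used in this description.
int_query = range_query("exampleIntegerField", 25, 50)
date_query = range_query("exampleDateField", "2000-11-01", "2014-12-01")
```

Passing int_query to search_url with a base URL of your choice yields a link that can be opened in a browser or fetched with any HTTP client to check that only datasets within the range are returned.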
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Facets still look the same as before. There is only a small change in the highlighting of search results; see my comment below.
Is there a release notes update needed for this change?:
Yes, there should be an info text describing the new feature + instructions for how to activate the feature:
Additional documentation:
/